Computer and Modernization ›› 2010, Vol. 1 ›› Issue (6): 137-0139. doi: 10.3969/j.issn.1006-2475.2010.06.039
• Networks and Communications •
LI Zhong-yuan, YANG Shou-wen
Abstract: This paper applies the classical vector space model (VSM) to the text classification of Web pages. The traditional TFIDF weighting formula has shortcomings when computing the weights of Web page keywords; in particular, it differentiates poorly between keywords. The proposed method divides the Web page structure into two parts, one containing the title, meta data, link anchor text, and page keywords, and the other containing the page body, and it strengthens the weighting of the keywords. Because the page body is computed with an improved IDF, the ability of keywords to discriminate between classes improves to a certain extent. Tests show that the method is feasible.
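The two-part weighting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' formula: the abstract does not give the improved IDF or the strengthening factor, so the sketch uses the classical IDF, a hypothetical constant `STRUCT_BOOST`, and the assumption that the boost applies to terms in the structural part (title, meta data, anchor text, page keywords). The names `idf`, `page_weights`, and the `struct`/`body` fields are likewise made up for the example.

```python
import math
from collections import Counter

# Hypothetical boost for terms found in the structural part of a page
# (title, meta data, anchor text, page keywords). The abstract does not
# state the actual factor, so this value is only an assumption.
STRUCT_BOOST = 3.0

def idf(term, docs):
    """Classical IDF, log(N / df). The paper's improved IDF is not reproduced here."""
    df = sum(1 for d in docs if term in d["struct"] or term in d["body"])
    return math.log(len(docs) / df) if df else 0.0

def page_weights(doc, docs, boost=STRUCT_BOOST):
    """Return {term: weight}, scaling structural term counts by `boost`."""
    struct_tf = Counter(doc["struct"])
    body_tf = Counter(doc["body"])
    weights = {}
    for term in set(struct_tf) | set(body_tf):
        tf = boost * struct_tf[term] + body_tf[term]
        weights[term] = tf * idf(term, docs)
    return weights

# Two toy pages, each split into a structural part and a body part.
docs = [
    {"struct": ["python", "tutorial"], "body": ["python", "code", "example"]},
    {"struct": ["football", "news"], "body": ["match", "code", "score"]},
]
w = page_weights(docs[0], docs)
```

In this toy corpus, a term that appears in every page ("code") gets weight zero, while a term that appears only in the structural part of one page ("tutorial") outweighs a term that appears only in its body ("example").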
Key words: VSM, feature representation, TFIDF
CLC Number: TP393
LI Zhong-yuan;YANG Shou-wen. Improvement of Weight of Web Page Features in Calculation Based on VSM[J]. Computer and Modernization, 2010, 1(6): 137-0139.
URL: http://www.c-a-m.org.cn/EN/10.3969/j.issn.1006-2475.2010.06.039
http://www.c-a-m.org.cn/EN/Y2010/V1/I6/137